MULTIPLICATION LOGIC CIRCUIT 



CLAIM OF PRIORITY 
This application claims priority under 35 U.S.C. 119 of United Kingdom
Application No. 0107212.3, filed March 22, 2001.

FIELD OF THE INVENTION 
The present invention generally relates to digital electronic devices and more
particularly to a multiplication logic circuit for multiplying two binary numbers.

BACKGROUND OF THE INVENTION 
It is useful in many applications to have a block that adds n inputs
together. The output of this block is a binary representation of the number of high
inputs. Such blocks, called parallel counters (L. Dadda, Some Schemes for Parallel
Multipliers, Alta Freq 34: 349-356 (1965); E. E. Swartzlander Jr., Parallel Counters,
IEEE Trans. Comput. C-22: 1021-1024 (1973)), are used in circuits performing
binary multiplication. There are other applications of a parallel counter, for instance,
majority-voting decoders or RSA encoders and decoders. It is important to have an
implementation of a parallel counter that achieves maximal speed. It is known to
use parallel counters in multiplication (L. Dadda, On Parallel Digital Multipliers,
Alta Freq 45: 574-580 (1976)).

A full adder is a special parallel counter with a three-bit input and a two-bit
output. A current implementation of higher parallel counters, i.e. those with a larger
number of inputs, is based on using full adders (C. C. Foster and F. D. Stockton,
Counting Responders in an Associative Memory, IEEE Trans. Comput. C-20: 1580-1583
(1971)). In general, the least significant bit of the output is the fastest bit to
produce in such an implementation, while the other bits are usually slower.
The following notation is used for logical operations:
$\oplus$ - Exclusive OR;
$\vee$ - OR;
$\wedge$ - AND;
$\neg$ - NOT.

An efficient prior art design (Foster and Stockton) of a parallel counter uses full
adders. A full adder, denoted FA, is a three-bit input parallel counter shown in
figure 1. It has three inputs $X_1$, $X_2$, $X_3$, and two outputs $S$ and $C$.
Logical expressions for outputs are

$S = X_1 \oplus X_2 \oplus X_3$,

$C = (X_1 \wedge X_2) \vee (X_1 \wedge X_3) \vee (X_2 \wedge X_3)$.

A half adder, denoted HA, is a two-bit input parallel counter shown in figure 1. It has
two inputs $X_1$, $X_2$ and two outputs $S$ and $C$. Logical expressions for outputs are

$S = X_1 \oplus X_2$,

$C = X_1 \wedge X_2$.
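
The expressions above can be checked exhaustively. The following is a minimal
Python sketch, not part of the original disclosure; the function names
full_adder, half_adder and check_adders are illustrative assumptions. It confirms
that the two outputs, read as the binary number CS, equal the number of high inputs.

```python
from itertools import product


def full_adder(x1, x2, x3):
    """Three-input parallel counter: returns (S, C) for single-bit inputs."""
    s = x1 ^ x2 ^ x3                       # S = X1 xor X2 xor X3
    c = (x1 & x2) | (x1 & x3) | (x2 & x3)  # C = majority of the three inputs
    return s, c


def half_adder(x1, x2):
    """Two-input parallel counter: returns (S, C) for single-bit inputs."""
    return x1 ^ x2, x1 & x2


def check_adders():
    """Return True if 2*C + S equals the count of high inputs in every case."""
    for bits in product((0, 1), repeat=3):
        s, c = full_adder(*bits)
        if 2 * c + s != sum(bits):
            return False
    for bits in product((0, 1), repeat=2):
        s, c = half_adder(*bits)
        if 2 * c + s != sum(bits):
            return False
    return True
```

Calling check_adders() covers all eight full adder cases and all four half adder
cases.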

A prior art implementation of a seven-bit input parallel counter is illustrated in
figure 2.
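
As an illustration of how full adders compose into a larger counter, the sketch
below builds a seven-input, three-output counter from four full adders. This is
one conventional arrangement and is only assumed to resemble figure 2, which is
not reproduced here; the names full_adder and counter7 are hypothetical.

```python
def full_adder(x1, x2, x3):
    """Three-input parallel counter: returns (S, C)."""
    s = x1 ^ x2 ^ x3
    c = (x1 & x2) | (x1 & x3) | (x2 & x3)
    return s, c


def counter7(x1, x2, x3, x4, x5, x6, x7):
    """Seven-input parallel counter built from four full adders.

    Returns (s0, s1, s2) with s0 the least significant bit, so that
    s0 + 2*s1 + 4*s2 is the number of high inputs.
    """
    # First level: two full adders count inputs 1-3 and 4-6.
    sum_a, carry_a = full_adder(x1, x2, x3)
    sum_b, carry_b = full_adder(x4, x5, x6)
    # Second level: combine the weight-1 bits with the seventh input.
    s0, carry_c = full_adder(sum_a, sum_b, x7)
    # Third level: combine the three weight-2 carries.
    s1, s2 = full_adder(carry_a, carry_b, carry_c)
    return s0, s1, s2
```

In this arrangement s0 passes through two full adders while s1 and s2 pass through
three, which is consistent with the observation that the least significant bit is
the fastest to produce.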

Multiplication is a fundamental operation. Given two n-digit binary numbers

$A_{n-1}2^{n-1} + A_{n-2}2^{n-2} + \ldots + A_1 2 + A_0$ and $B_{n-1}2^{n-1} + B_{n-2}2^{n-2} + \ldots + B_1 2 + B_0$,

their product

$P_{2n-1}2^{2n-1} + P_{2n-2}2^{2n-2} + \ldots + P_1 2 + P_0$

may have up to 2n digits. Wallace invented the first fast architecture for a
multiplier, now called the Wallace-tree multiplier (Wallace, C. S., A Suggestion for a
Fast Multiplier, IEEE Trans. Electron. Comput. EC-13: 14-17 (1964)). Dadda has
investigated bit behaviour in a multiplier (L. Dadda, Some Schemes for Parallel
Multipliers, Alta Freq 34: 349-356 (1965)). He has constructed a variety of
multipliers and most multipliers follow Dadda's scheme.

Dadda's multiplier uses the scheme shown in figure 3. If inputs have 8 bits then
64 parallel AND gates generate an array shown in figure 4. The AND sign $\wedge$ is
omitted for clarity so that $A_i \wedge B_j$ becomes $A_i B_j$. The rest of figure 4
illustrates array reduction that involves full adders (FA) and half adders (HA). Bits
from the same column are added by half adders or full adders. Some groups of bits fed
into a full adder are in rectangles. Some groups of bits fed into a half adder are in
ovals. The result of array reduction is just two binary numbers to be added at the
last step. One adds these two numbers by one of the fast addition schemes, for
instance, a conditional adder or a carry-look-ahead adder.
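
The array generation step can be sketched in a few lines of Python. This is an
illustrative model only, assuming an undeformed array in which the bit $A_i B_j$
is placed in the column of weight $2^{i+j}$; the function names are hypothetical
and the layout of figure 4 is not reproduced.

```python
def partial_product_columns(a, b, n):
    """Return the AND array of two n-bit numbers as a list of columns.

    columns[k] holds every bit A_i AND B_j with i + j == k, so each bit
    in columns[k] has weight 2**k.
    """
    columns = [[] for _ in range(2 * n - 1)]
    for i in range(n):
        for j in range(n):
            a_i = (a >> i) & 1
            b_j = (b >> j) & 1
            columns[i + j].append(a_i & b_j)  # one AND gate per bit pair
    return columns


def column_heights(n):
    """Number of bits in each column of the undeformed n-by-n array."""
    return [len(col) for col in partial_product_columns(0, 0, n)]


def array_value(columns):
    """Weighted sum of all bits in the array, i.e. the product it encodes."""
    return sum(sum(col) << k for k, col in enumerate(columns))
```

For 8 bit inputs, column_heights(8) gives heights rising from 1 to 8 and falling
back to 1, for a total of 64 bits, one per AND gate. Array reduction then has to
bring every column down to at most two bits without changing array_value.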

UK patent application numbers 0019287.2 and 0101961.1 and US patent
application number 09/637,532 and US patent application entitled "A parallel
counter and a multiplication logic circuit" filed on 25 January 2001, the contents of
all of which are hereby incorporated by reference, disclose a technique for the
modification or deformation of the array prior to array reduction. The array
deformation derives the benefit of reducing the depth of the array to a number
greater than $2^{n-1}-1$ and less than or equal to $2^n-1$, where n is an integer. This
reduction of the maximum depth of the array enables the efficient use of parallel
counters in the array reduction step.

SUMMARY OF THE INVENTION 
It is an object of the present invention to provide an improved multiplication
logic circuit in which the speed of operation of the multiplication logic circuit is
improved.

The present inventors have realised that in the array reduction step the use of
maximal length parallel counters can significantly reduce wiring delays present in
the prior art array reduction logic. The inventors have also realised, however, that the
outputs of the maximal length parallel counters experience different gate delays.
Thus in accordance with the present invention, in addition to the use of maximal
length parallel counters in the array reduction step, the outputs of the maximal length
parallel counters are input to reduction logic circuits with asymmetric delays to
ameliorate the effects of the differential delays of the outputs of the parallel counter
circuits.

Thus in accordance with the present invention, outputs generated from the
maximal length parallel counter logic that experience shorter delays are input to
reduction logic inputs which incur longer delays in the generation of the output.
Outputs of the maximal length parallel counter logic that experience longer delays
within the parallel counter logic are input to inputs of the asymmetric reduction logic
which experience shorter delays in the generation of the output. In this way the
overall delays through the parallel counter logic and the further reduction logic are
balanced, and the differences in delays through the parallel counter logic are
compensated for by the further reduction logic.

In accordance with the present invention, at least one maximal parallel
counter is used in the array reduction step to reduce the array in one dimension by
receiving all of the values in the array in one column.

In a preferred embodiment of the present invention the array is modified by
undergoing the array deformation as disclosed in co-pending UK applications
numbers 0019287.2 and 0101961.1, US application number 09/637,532, and US
application number 09/7^9,954, the contents of which are hereby incorporated by
reference. Array deformation provides the benefit of reducing the number of inputs
for a maximal column to a number greater than $2^{n-1}-1$ and less than or equal to
$2^n-1$, where n is an integer. For example, for the multiplication of two 16 bit
numbers, the array deformation process reduces the maximal depth of the array to 15
bits in any given column, thereby enabling 15 bit input, 4 bit output parallel counters
to be used in the first reduction step to reduce the array depth to a maximum of 4
bits. For a 32 bit input, the array deformation step reduces the maximal height of the
array to 31 bits in any given column, thereby enabling a 31 bit input, 5 bit output
parallel counter to be used to provide an array of reduced depth which is a maximum
of 5 bits.

In an embodiment of the present invention, the reduction logic with
asymmetric delays comprises any combination of full adders, half adders and 4 to 2
compressors. Where the number of outputs from the parallel counters is 4 or more, 4
to 2 compressors are preferably used to generate 2 bit outputs.
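
The counter sizes quoted above follow from simple arithmetic: a counter with m
inputs must represent every count from 0 to m, so it needs as many output bits as
the binary representation of m. The Python sketch below is a behavioural model
under that assumption; the function names are illustrative and no particular gate
level design is implied.

```python
def counter_output_bits(num_inputs):
    """Output width of a parallel counter with num_inputs inputs.

    The counter must encode every value in 0..num_inputs, which takes
    as many bits as the binary representation of num_inputs.
    """
    return num_inputs.bit_length()


def parallel_counter(bits):
    """Behavioural model: count the high inputs, least significant bit first."""
    count = sum(bits)
    width = counter_output_bits(len(bits))
    return [(count >> k) & 1 for k in range(width)]
```

A 15 input counter needs 4 outputs and a 31 input counter needs 5, whereas a 16
input column would already need 5 outputs. This is why limiting the column depth
to at most $2^n-1$ lets each counter use all n of its output codes.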

BRIEF DESCRIPTION OF THE DRAWINGS 
Embodiments of the present invention will now be described with reference
to the accompanying drawings, in which:

Figure 1 is a schematic diagram of a full adder and a half adder in accordance
with the prior art;

Figure 2 is a schematic diagram of a parallel counter using full adders in
accordance with the prior art;

Figure 3 is a diagram of the steps used in the prior art for multiplication;

Figure 4 is a schematic diagram of the process of Figure 3 in more detail;

Figure 5 is a schematic diagram illustrating the structure of a generated
deformed array in accordance with an embodiment of the present invention;

Figure 6 is a schematic diagram illustrating the array after reduction by
maximal length parallel counters in accordance with an embodiment of the present
invention;

Figure 7 is a diagram of the logic of a full adder showing the gate delays;

Figure 8 is a schematic diagram of a 4 to 2 compressor constructed from full
adders in accordance with an embodiment of the present invention;

Figure 9 is a schematic diagram of the logic circuit for the second stage of the
array reduction using 4 to 2 compressors in accordance with an embodiment of the
present invention; and

Figure 10 is a diagram of the logic of a 4 to 2 compressor.



DETAILED DESCRIPTION
In the embodiment illustrated in Figure 5, the array generated in the process
for multiplying two 16 bit binary numbers A and B is formed as a deformed array in
accordance with the process disclosed in copending UK patent applications numbers
0019287.2 and 0101961.1, US patent application number 09/637,532 and a US
patent application number 09/7^9,954, the contents of which are hereby incorporated
by reference. The advantage of this array over the array of the prior art as illustrated
in Figure 4 is that the maximum number of bits in a column is smaller. In the prior
art, for a 16 bit multiplication, a column will have 16 bits. The array of Figure 5 has
four columns with 15 bits.

The first reduction step to reduce the array comprises the use of parallel
counters to reduce each column from a maximum of 15 bits to 4 bits maximum as
illustrated in Figure 6. Any conventional parallel counters can be used for reducing
the maximal columns of 15 bits to 4 bits, although it is preferable to use the parallel
counters disclosed in the co-pending applications identified above.

The 4 bits output from the parallel counters will have experienced different
gate delays. Typically 2 outputs experience 4 gate delays and 2 outputs experience 5
gate delays. However, the use of a single logic circuit in the form of a maximal
length parallel counter for the reduction of the array greatly reduces the wiring
between circuits. There is thus a significant wiring benefit in using maximal length
parallel counters.

Figure 7 is a logic diagram of a full adder that illustrates the asymmetric
nature of the circuit. Inputs A and B can comprise outputs from a maximal length
parallel counter which have experienced 4 gate delays and are therefore relatively
advanced compared to the input C to the circuit, which is an output from the maximal
length parallel counter which has experienced 5 gate delays. Each gate delay in this
example is expressed in units of an EXOR gate delay, the EXOR gate being the slowest
gate. AND and OR gates are considered to have a relative delay of 0.5. Figure 7
illustrates the cumulative gate delay and, as can be seen, the sum S is output with a
cumulative gate delay of 6 and a carry C is also output with a cumulative gate delay
of 6. Thus the full adder can be used as part of the second level of array reduction
in order to compensate for the relative gate delays of the outputs of the maximal
length parallel counters in the first level of array reduction.
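
The cumulative delays quoted above can be reproduced with a small arrival time
model. The sketch below assumes a common full adder structure in which the sum is
(A EXOR B) EXOR C and the carry is (A AND B) OR ((A EXOR B) AND C); the exact gate
arrangement of Figure 7 is not reproduced here, so this structure, the constant
names and the function name are assumptions chosen to match the stated delays.

```python
XOR_DELAY = 1.0      # EXOR gate, the unit of delay in this example
AND_OR_DELAY = 0.5   # AND and OR gates, relative to an EXOR gate


def full_adder_arrival(t_a, t_b, t_c):
    """Return (t_sum, t_carry) given the arrival times of inputs A, B and C.

    Assumed structure: S = (A xor B) xor C,
                       C_out = (A and B) or ((A xor B) and C).
    """
    t_ab_xor = max(t_a, t_b) + XOR_DELAY             # A xor B
    t_sum = max(t_ab_xor, t_c) + XOR_DELAY           # (A xor B) xor C
    t_ab_and = max(t_a, t_b) + AND_OR_DELAY          # A and B
    t_prop = max(t_ab_xor, t_c) + AND_OR_DELAY       # (A xor B) and C
    t_carry = max(t_ab_and, t_prop) + AND_OR_DELAY   # final OR
    return t_sum, t_carry
```

With the early signals on A and B and the late signal on C,
full_adder_arrival(4, 4, 5) returns (6.0, 6.0), matching the text. If the late
signal were instead connected to A, full_adder_arrival(5, 4, 4) returns
(7.0, 7.0), which shows the benefit of matching late signals to the short path.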

Figure 8 is a schematic logic diagram of two adjacent 4 to 2 compressors
each comprised of 2 full adders. The relative gate delays are shown to illustrate
the asymmetric nature of the logic used as a second level of logic reduction in this
embodiment of the present invention.

Figure 9 illustrates a chain of 4 to 2 compressors used to receive each of the 4 bit
columns of bits from the reduced array following the first level of reduction by the
maximal length parallel counters. The output of the 4 to 2 compressors for each
column comprises 2 bits. The 2 bits can then be added using conventional addition
logic circuitry to generate the output binary number comprising the product of
the two n bit binary numbers.
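
A functional model of such a chain can be sketched as follows. It assumes the
widely used 4 to 2 compressor built from two full adders, where the intermediate
carry of each column feeds the next column; the wiring of Figures 8 and 9 is not
reproduced, timing is not modelled, and all names are illustrative.

```python
def full_adder(x1, x2, x3):
    """Three-input parallel counter: returns (S, C)."""
    return x1 ^ x2 ^ x3, (x1 & x2) | (x1 & x3) | (x2 & x3)


def compressor_4_2(x1, x2, x3, x4, cin):
    """4 to 2 compressor from two full adders.

    Returns (s, carry, cout) with x1+x2+x3+x4+cin == s + 2*(carry + cout).
    cout does not depend on cin, so carries do not ripple along the chain.
    """
    s1, cout = full_adder(x1, x2, x3)
    s, carry = full_adder(s1, x4, cin)
    return s, carry, cout


def compress_columns(columns):
    """Reduce columns of up to 4 bits each to two binary numbers.

    columns[k] holds bits of weight 2**k. Returns (row_sum, row_carry)
    whose ordinary sum equals the weighted value of the whole array.
    """
    row_sum, row_carry, cin = 0, 0, 0
    for k, col in enumerate(columns):
        x1, x2, x3, x4 = list(col) + [0] * (4 - len(col))  # pad short columns
        s, carry, cout = compressor_4_2(x1, x2, x3, x4, cin)
        row_sum |= s << k
        row_carry |= carry << (k + 1)   # carry has twice the column weight
        cin = cout                      # intermediate carry to next column
    row_sum |= cin << len(columns)      # carry out of the last column
    return row_sum, row_carry
```

The two returned numbers are the inputs to the final conventional adder.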

Figure 10 is a logic diagram of the 4 to 2 compressor in accordance with an
embodiment of the present invention.

Thus in this embodiment of the present invention an array is generated and
modified by array deformation in accordance with the applicant's earlier inventive
array modification technique. The array is reduced in two stages. The first stage is
built upon the recognition that the wiring of the multiplication logic circuit can be
reduced if a single parallel counter is used for the reduction of each column of the
array. This however results in outputs which have suffered differential gate delays.
Thus the invention ameliorates this problem by using a second level of array
reduction which uses logic circuits for which the inputs experience relative
differential gate delays, i.e. the logic circuit imposes asymmetric delays on the
inputs. In this way the relative delays caused by the use of the maximal length
parallel counters do not cause a delay in the further reduction step.
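
The first reduction stage can be modelled behaviourally as shown below. This is a
sketch under stated assumptions: the array is the undeformed AND array, since the
deformation technique of the co-pending applications is not reproduced here, and
each column counter is modelled by simply counting its high bits. All function
names are hypothetical.

```python
def partial_product_columns(a, b, n):
    """Undeformed AND array: columns[k] holds bits A_i AND B_j with i+j == k."""
    columns = [[] for _ in range(2 * n - 1)]
    for i in range(n):
        for j in range(n):
            columns[i + j].append(((a >> i) & 1) & ((b >> j) & 1))
    return columns


def first_stage(columns):
    """Apply one parallel counter to every column of the array.

    Output bit j of the counter for column k has weight 2**(k+j), so it
    is placed in column k+j of the reduced array.
    """
    reduced = []
    for k, col in enumerate(columns):
        count = sum(col)                       # behavioural counter model
        for j in range(len(col).bit_length()):
            while len(reduced) <= k + j:
                reduced.append([])
            reduced[k + j].append((count >> j) & 1)
    return reduced


def array_value(columns):
    """Weighted sum of all bits in the array."""
    return sum(sum(col) << k for k, col in enumerate(columns))


def max_height(columns):
    """Largest number of bits found in any single column."""
    return max(len(col) for col in columns)
```

For two 8 bit inputs the reduced array has at most 4 bits per column, ready for 4
to 2 compressors, and array_value is unchanged. For an undeformed 16 bit array the
16 bit column needs a 5 bit counter output and the reduced array reaches 5 bits
in one column, which illustrates why the embodiment limits columns to 15 bits
before this stage.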

Thus this multiplication logic circuit is highly efficient since it has reduced
wiring and increased speed because of the balancing of the gate delays in the logic
circuit.

Although the present invention has been described hereinabove with
reference to a specific embodiment, it will be apparent to a skilled person in the art
that modifications lie within the spirit and scope of the present invention.

For example, although the present invention has been described hereinabove
with reference to a specific example in which the array is deformed before array
reduction, the present invention is applicable to the reduction of an undeformed
array. For example, the array can be generated using any prior art technique and can
include the use of Booth encoding for the array generation step.

In the present invention any prior art parallel counter logic circuit can be used
for the first level of the array reduction. Parallel counters can be used for any
number of the columns and need not be used for all columns. For example, for the
columns with three bits, a full adder can be used. It may also be desirable for some
columns to use full adders rather than the parallel counter. The number of columns
reduced by the use of parallel counters is a design choice. It is however envisaged
that it is preferable to use parallel counters for any columns having more than 3 bits
in the array.

In accordance with the present invention, the second array reduction step can
be implemented by any suitable logic for which there are differential delays
experienced by the inputs in the generation of the outputs.

Although in the present invention any form of parallel counter can be used, in
a preferred embodiment, the parallel counters disclosed in UK patent applications
numbers 0019287.2 and 0101961.1, US patent application number 09/637,532 and a
US patent application number 09/7^9,954 are used.

In the present invention any conventional method can be used for the final
step of addition of the two binary numbers in order to generate the output of the
multiplication logic circuit.